McNemar test
The McNemar test is a statistical test used to analyze paired nominal data or matched data collected from two related groups or conditions. It assesses whether there is a significant difference in proportions or frequencies of a categorical outcome between paired observations. The McNemar test is particularly useful when dealing with data that are not independent, such as before-and-after measurements on the same subjects or matched pairs in case-control studies.
Understanding the McNemar Test:
1. Null and Alternative Hypotheses:
- Null Hypothesis (H0): There is no difference in the proportion of the categorical outcome between the two conditions; equivalently, a pair is equally likely to be discordant in either direction.
- Alternative Hypothesis (H1): The proportions differ between the two conditions, i.e., discordant pairs are more likely in one direction than in the other.
2. Test Statistic:
- The McNemar test statistic is based only on the discordant pairs, i.e., pairs where the outcomes differ between the two conditions. Concordant pairs, i.e., pairs where the outcomes are the same, carry no information about a change and do not enter the statistic.
- Under the null hypothesis, the test statistic approximately follows a chi-squared distribution with one degree of freedom, provided the number of discordant pairs is reasonably large.
3. Calculation of Test Statistic:
- Let \(a\) be the number of discordant pairs (where one subject has a positive outcome in the first condition and a negative outcome in the second condition), and \(b\) be the number of discordant pairs in the opposite direction.
- The McNemar test statistic, denoted as \(\chi^2_{M}\), is calculated as: \[\chi^2_{M} = \frac{(b - a)^2}{b + a}\]
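The form of this statistic can be motivated by a standard argument, sketched here as background rather than taken from the text above. Under H0 each discordant pair is equally likely to fall in either direction, so, given the total \(a + b\), the count \(b\) is binomial with success probability \(1/2\):
\[b \mid (a + b) \sim \mathrm{Binomial}\left(a + b, \tfrac{1}{2}\right), \qquad E[b] = \frac{a + b}{2}, \qquad \mathrm{Var}(b) = \frac{a + b}{4}\]
Standardizing \(b\) and squaring gives the statistic:
\[z = \frac{b - \frac{a + b}{2}}{\sqrt{\frac{a + b}{4}}} = \frac{b - a}{\sqrt{a + b}}, \qquad \chi^2_{M} = z^2 = \frac{(b - a)^2}{b + a}\]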
4. Interpretation of Results:
- If the calculated \(\chi^2_{M}\) value is greater than the critical value from the chi-squared distribution with one degree of freedom at the chosen significance level (commonly \(\alpha = 0.05\)), then the null hypothesis is rejected, indicating a significant difference between the two conditions.
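The steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name `mcnemar_chi2` and the counts passed to it are hypothetical, and the p-value relies on the identity \(P(\chi^2_1 > x) = \mathrm{erfc}(\sqrt{x/2})\) for one degree of freedom.

```python
import math

def mcnemar_chi2(a, b, alpha=0.05):
    """Uncorrected McNemar test from the two discordant counts a and b."""
    if a + b == 0:
        raise ValueError("no discordant pairs: the statistic is undefined")
    statistic = (b - a) ** 2 / (b + a)
    # For 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value, p_value < alpha

# Hypothetical counts, chosen only for illustration.
statistic, p_value, reject = mcnemar_chi2(a=8, b=20)
print(f"chi2 = {statistic:.3f}, p = {p_value:.4f}, reject H0: {reject}")
```

Comparing the p-value with \(\alpha\) is equivalent to comparing the statistic with the critical value, so either rule leads to the same decision.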
Applications of McNemar Test:
A. Medical Research:
- In clinical trials, the McNemar test is used to assess whether a binary treatment outcome differs between matched pairs of patients receiving different treatments, or before and after treatment within the same group.
B. Education:
- Educational researchers may use the McNemar test to evaluate the effectiveness of teaching methods or interventions by comparing students' pre-test and post-test results on a binary outcome such as pass/fail.
C. Epidemiology:
- Epidemiologists use the McNemar test to analyze matched case-control studies or cohort studies where data are collected from the same subjects at different time points.
D. Psychology and Social Sciences:
- Researchers in psychology and social sciences utilize the McNemar test to analyze paired survey responses, preferences, or behaviors before and after exposure to certain stimuli or interventions.
Considerations:
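- The McNemar test assumes that the data are paired and dichotomous, and that the pairs are independent of one another.
- The chi-squared approximation becomes unreliable when the number of discordant pairs is small (a common rule of thumb is fewer than about 25); in that case an exact binomial test or a continuity correction is preferable.
- Extensions of the McNemar test exist for analyzing categorical data with more than two categories, such as the McNemar-Bowker test of symmetry and the Stuart-Maxwell test of marginal homogeneity.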
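For small numbers of discordant pairs, a commonly used alternative is the exact binomial form of the test. As a sketch, writing \(a\) and \(b\) for the two discordant counts (as in the test statistic above), the two-sided exact p-value is:
\[p = \min\left(1,\; 2 \sum_{k=0}^{\min(a,\,b)} \binom{a + b}{k} \left(\tfrac{1}{2}\right)^{a + b}\right)\]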
In conclusion, the McNemar test is a valuable tool for analyzing paired nominal data and is widely used in various fields including medical research, education, epidemiology, and psychology. By comparing paired observations, researchers can assess the significance of differences in proportions or frequencies, providing valuable insights for decision-making and inference.
Example problem to demonstrate the calculation of the McNemar test.
Suppose we’re conducting a study to evaluate the effectiveness of a new teaching method for improving students’ performance in a mathematics exam. We collect data from 50 students who were administered a pre-test (before the teaching method was introduced) and a post-test (after the teaching method was introduced). Each student’s performance is categorized as “pass” or “fail” in both the pre-test and post-test.
Here’s a summary of the data:
- In the pre-test, 25 students passed and 25 students failed.
- In the post-test, 35 students passed and 15 students failed.
- Among the students who passed the pre-test, 20 also passed the post-test.
- Among the students who failed the pre-test, 15 passed the post-test.
We want to determine if there is a significant difference in the proportions of students passing the exam before and after the teaching method was introduced.
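Before building the table, it can help to see how the four cell counts follow from the summary: 25 - 20 = 5 students passed the pre-test but failed the post-test, and 25 - 15 = 10 failed both. The sketch below rebuilds synthetic per-student records from those counts (the individual records are not part of the original data) and tallies them.

```python
from collections import Counter

# Synthetic per-student records (pre, post), rebuilt from the summary counts above.
records = (
    [("pass", "pass")] * 20    # passed both tests
    + [("pass", "fail")] * 5   # passed pre-test, failed post-test
    + [("fail", "pass")] * 15  # failed pre-test, passed post-test
    + [("fail", "fail")] * 10  # failed both tests
)

cells = Counter(records)
a = cells[("pass", "fail")]  # discordant: pass -> fail
b = cells[("fail", "pass")]  # discordant: fail -> pass

# Marginal totals should match the summary: 25 pre-test passes, 35 post-test passes.
pre_pass = sum(n for (pre, _), n in cells.items() if pre == "pass")
post_pass = sum(n for (_, post), n in cells.items() if post == "pass")
print(len(records), pre_pass, post_pass, a, b)  # prints: 50 25 35 5 15
```

Only the two discordant counts, 5 and 15, enter the test; the 30 concordant students do not affect it.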
Calculation of McNemar Test:
- Create a Contingency Table:

| | Passed Post-test (Yes) | Passed Post-test (No) |
|---|---|---|
| Passed Pre-test (Yes) | 20 | 5 |
| Passed Pre-test (No) | 15 | 10 |
- Calculate the McNemar Test Statistic:
  - \(a\) = Number of discordant pairs where a student passed the pre-test but failed the post-test = 5
  - \(b\) = Number of discordant pairs where a student failed the pre-test but passed the post-test = 15
\[\chi^2_{M} = \frac{(b - a)^2}{b + a}\]
\[\chi^2_{M} = \frac{(15 - 5)^2}{15 + 5} = \frac{100}{20} = 5\]
- Determine the Critical Value:
  - With 1 degree of freedom and \(\alpha = 0.05\), the critical value of the chi-squared distribution is approximately 3.84.
- Compare the Test Statistic to the Critical Value:
  - Since \(\chi^2_{M} = 5 > 3.84\), we reject the null hypothesis.
Interpretation:
Based on the McNemar test, we conclude that there is a significant difference in the proportions of students passing the exam before and after the teaching method was introduced. The pass rate rose from 50% (25 of 50) to 70% (35 of 50), so the change is in the positive direction. Because the study has no control group, this result is consistent with a benefit of the teaching method but does not by itself rule out other explanations.
Calculation of McNemar Test with R:
Code
# mcnemar.test() is part of base R's stats package, so no extra library is needed
alpha <- 0.05
# Create the contingency table: rows = pre-test, columns = post-test
contingency_table <- matrix(c(20, 5,
                              15, 10),
                            nrow = 2, byrow = TRUE,
                            dimnames = list(pre_test = c("pass", "fail"),
                                            post_test = c("pass", "fail")))
# Perform McNemar test (a continuity correction is applied by default: correct = TRUE)
result <- mcnemar.test(contingency_table)
result
McNemar's Chi-squared test with continuity correction
data: contingency_table
McNemar's chi-squared = 4.05, df = 1, p-value = 0.04417
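The reported statistic (4.05) is smaller than the hand-calculated value of 5 because R's `mcnemar.test` applies a continuity correction by default (`correct = TRUE`). With the discordant counts \(a = 5\) and \(b = 15\), the corrected statistic is:
\[\chi^2_{M,\text{corr}} = \frac{(|b - a| - 1)^2}{b + a} = \frac{(|15 - 5| - 1)^2}{15 + 5} = \frac{81}{20} = 4.05\]
Calling `mcnemar.test(contingency_table, correct = FALSE)` should reproduce the uncorrected value of 5.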
Code
p_value <- result$p.value
if (p_value < alpha) {
  cat("Reject null hypothesis\n")
} else {
  cat("Do not reject null hypothesis\n")
}
Calculation of McNemar Test with Python:
Code
from statsmodels.stats.contingency_tables import mcnemar
# Define the contingency table: rows = pre-test (pass, fail), columns = post-test (pass, fail)
contingency_table = [[20, 5],
                     [15, 10]]
# Perform McNemar test. By default exact=True, so this runs the exact binomial test:
# "statistic" is then the smaller discordant count (5), not the chi-squared value.
result = mcnemar(contingency_table)
print(result)
pvalue 0.04138946533203125
statistic 5.0
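The `statistic` of 5.0 printed above is not the chi-squared value, even though the two numbers coincide in this example. statsmodels' `mcnemar` defaults to `exact=True`, which performs an exact binomial test on the discordant pairs and reports the smaller discordant count as the statistic. The sketch below reproduces the printed p-value with the standard library only; the helper name `exact_mcnemar_p` is made up for illustration.

```python
from math import comb

def exact_mcnemar_p(a, b):
    """Two-sided exact binomial p-value from the discordant counts a and b."""
    n = a + b
    k = min(a, b)
    # P(X <= k) for X ~ Binomial(n, 0.5), doubled for a two-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

p_value = exact_mcnemar_p(a=5, b=15)
print(min(5, 15), p_value)  # prints: 5 0.04138946533203125
```

To obtain the chi-squared versions from statsmodels instead, `mcnemar(contingency_table, exact=False, correction=False)` should give the uncorrected statistic, and `correction=True` the continuity-corrected one.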
Code
# Extract the statistic and p-value from the result
statistic = result.statistic
p_value = result.pvalue
# Compare p-value to significance level alpha
alpha = 0.05
if p_value < alpha:
    print("Reject null hypothesis")
else:
    print("Do not reject null hypothesis")